Chengming Zhang

Event-VStream: Event-Driven Real-Time Understanding for Long Video Streams

Jan 22, 2026

SDiT: Semantic Region-Adaptive for Diffusion Transformers

Jan 18, 2026

GFormer: Accelerating Large Language Models with Optimized Transformers on Gaudi Processors

Dec 19, 2024

AdaCM$^2$: On Understanding Extremely Long-Term Video with Adaptive Cross-Modality Memory Reduction

Nov 19, 2024

SDP4Bit: Toward 4-bit Communication Quantization in Sharded Data Parallelism for LLM Training

Oct 20, 2024

DeepSpeed4Science Initiative: Enabling Large-Scale Scientific Discovery through Sophisticated AI System Technologies

Oct 11, 2023

Benchmarking and In-depth Performance Study of Large Language Models on Habana Gaudi Processors

Sep 29, 2023

DeepSpeed Ulysses: System Optimizations for Enabling Training of Extreme Long Sequence Transformer Models

Sep 25, 2023

PapagAI: Automated Feedback for Reflective Essays

Jul 10, 2023

HEAT: A Highly Efficient and Affordable Training System for Collaborative Filtering Based Recommendation on CPUs

May 03, 2023